10 research outputs found

Gradient-trained Weights in Wide Neural Networks Align Layerwise to Error-scaled Input Correlations

Author: Boopathy Akhilan
Fiete Ila
Publication date: 15/06/2021

Recent works have examined how deep neural networks, which can solve a variety of difficult problems, incorporate the statistics of training data to achieve their success. However, existing results have been established only in limited settings. In this work, we derive the layerwise weight dynamics of infinite-width neural networks with nonlinear activations trained by gradient descent. We show theoretically that weight updates are aligned with input correlations from intermediate layers weighted by error, and demonstrate empirically that the result also holds in finite-width wide networks. The alignment result allows us to formulate backpropagation-free learning rules, named Align-zero and Align-ada, that theoretically achieve the same alignment as backpropagation. Finally, we test these learning rules on benchmark problems in feedforward and recurrent neural networks and demonstrate, in wide networks, comparable performance to backpropagation.

Comment: 22 pages, 11 figures

arXiv.org e-Print Archive

CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks

Author: Boopathy Akhilan
Chen Pin-Yu
Daniel Luca
Liu Sijia
Weng Tsui-Wei
Publication date: 29/11/2018

Verifying robustness of neural network classifiers has attracted great interest and attention due to the success of deep neural networks and their unexpected vulnerability to adversarial perturbations. Although finding minimum adversarial distortion of neural networks (with ReLU activations) has been shown to be an NP-complete problem, obtaining a non-trivial lower bound of minimum distortion as a provable robustness guarantee is possible. However, most previous works only focused on simple fully-connected layers (multilayer perceptrons) and were limited to ReLU activations. This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness on general convolutional neural networks. Our framework is general -- we can handle various architectures including convolutional layers, max-pooling layers, batch normalization layers, residual blocks, as well as general activation functions; our approach is efficient -- by exploiting the special structure of convolutional layers, we achieve up to 17 and 11 times of speed-up compared to the state-of-the-art certification algorithms (e.g. Fast-Lin, CROWN) and 366 times of speed-up compared to the dual-LP approach while our algorithm obtains similar or even better verification bounds. In addition, CNN-Cert generalizes state-of-the-art algorithms e.g. Fast-Lin and CROWN. We demonstrate by extensive experiments that our method outperforms state-of-the-art lower-bound-based certification algorithms in terms of both bound quality and speed.

Comment: Accepted by AAAI 201

arXiv.org e-Print Archive

DSpace@MIT

Association for the Advancement of Artificial Intelligence: AAAI Publications

Model-agnostic Measure of Generalization Difficulty

Author: Boopathy Akhilan
Fiete Ila
Ge Shu
Hwang Jaedong
Liu Kevin
Mohammedsaleh Asaad
Publication date: 01/05/2023

The measure of a machine learning algorithm is the difficulty of the tasks it can perform, and sufficiently difficult tasks are critical drivers of strong machine learning models. However, quantifying the generalization difficulty of machine learning benchmarks has remained challenging. We propose what is to our knowledge the first model-agnostic measure of the inherent generalization difficulty of tasks. Our inductive bias complexity measure quantifies the total information required to generalize well on a task minus the information provided by the data. It does so by measuring the fractional volume occupied by hypotheses that generalize on a task given that they fit the training data. It scales exponentially with the intrinsic dimensionality of the space over which the model must generalize but only polynomially in resolution per dimension, showing that tasks which require generalizing over many dimensions are drastically more difficult than tasks involving more detail in fewer dimensions. Our measure can be applied to compute and compare supervised learning, reinforcement learning and meta-learning generalization difficulties against each other. We show that applied empirically, it formally quantifies intuitively expected trends, e.g. that in terms of required inductive bias, MNIST < CIFAR10 < ImageNet and fully observable Markov decision processes (MDPs) < partially observable MDPs. Further, we show that classification of complex images < few-shot meta-learning with simple images. Our measure provides a quantitative metric to guide the construction of more complex tasks requiring greater inductive bias, and thereby encourages the development of more sophisticated architectures and learning algorithms with more powerful generalization capabilities.

Comment: Accepted at ICML 2023, 28 pages, 6 figures

arXiv.org e-Print Archive

Neuro-Inspired Fragmentation and Recall to Overcome Catastrophic Forgetting in Curiosity

Author: Agrawal Pulkit
Boopathy Akhilan
Chen Eric
Fiete Ila
Hong Zhang-Wei
Hwang Jaedong
Publication date: 26/10/2023

Deep reinforcement learning methods exhibit impressive performance on a range of tasks but still struggle on hard exploration tasks in large environments with sparse rewards. To address this, intrinsic rewards can be generated using forward model prediction errors that decrease as the environment becomes known, and incentivize an agent to explore novel states. While prediction-based intrinsic rewards can help agents solve hard exploration tasks, they can suffer from catastrophic forgetting and actually increase at visited states. We first examine the conditions and causes of catastrophic forgetting in grid world environments. We then propose a new method, FARCuriosity, inspired by how humans and animals learn. The method depends on fragmentation and recall: an agent fragments an environment based on surprisal, and uses different local curiosity modules (prediction-based intrinsic reward functions) for each fragment so that modules are not trained on the entire environment. At each fragmentation event, the agent stores the current module in long-term memory (LTM) and either initializes a new module or recalls a previously stored module based on its match with the current state. With fragmentation and recall, FARCuriosity achieves less forgetting and better overall performance in games with varied and heterogeneous environments in the Atari benchmark suite of tasks. Thus, this work highlights the problem of catastrophic forgetting in prediction-based curiosity methods and proposes a solution.

Comment: NeurIPS 2023 Workshop - Intrinsically Motivated Open-ended Learning

arXiv.org e-Print Archive

Towards More Generalizable Neural Networks via Modularity

Author: Boopathy Akhilan
Publication venue: Massachusetts Institute of Technology
Publication date: 21/06/2022

Artificial neural networks have become highly effective at performing specific, challenging tasks by leveraging a large amount of training data. However, they are unable to generalize to diverse, unseen domains without requiring significant retraining. This thesis quantifies the generalization difficulty of a task as the amount of information content in the inductive biases required to solve a task, and demonstrates that generalization difficulty relies crucially on the number of dimensions of generalization. Inspired by the modularity of biological learning systems, this thesis then demonstrates theoretically and empirically that modularity promotes generalization by providing a powerful inductive bias. Finally, the thesis proposes a new challenging spatial navigation benchmark that requires a broad degree of generalization from a small amount of training data. This benchmark is presented as a test of the generalization capability of learning algorithms; based on the results of this thesis, modularity is expected to promote generalization on this benchmark.

S.M.

DSpace@MIT

CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks

Author: Boopathy Akhilan
Chen Pin-Yu
Daniel Luca
Liu Sijia
Weng Tsui-Wei
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 17/07/2019

Verifying robustness of neural network classifiers has attracted great interest and attention due to the success of deep neural networks and their unexpected vulnerability to adversarial perturbations. Although finding minimum adversarial distortion of neural networks (with ReLU activations) has been shown to be an NP-complete problem, obtaining a non-trivial lower bound of minimum distortion as a provable robustness guarantee is possible. However, most previous works only focused on simple fully-connected layers (multilayer perceptrons) and were limited to ReLU activations. This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness on general convolutional neural networks. Our framework is general - we can handle various architectures including convolutional layers, max-pooling layers, batch normalization layers, residual blocks, as well as general activation functions; our approach is efficient - by exploiting the special structure of convolutional layers, we achieve up to 17 and 11 times of speed-up compared to the state-of-the-art certification algorithms (e.g. Fast-Lin, CROWN) and 366 times of speed-up compared to the dual-LP approach while our algorithm obtains similar or even better verification bounds. In addition, CNN-Cert generalizes state-of-the-art algorithms e.g. Fast-Lin and CROWN. We demonstrate by extensive experiments that our method outperforms state-of-the-art lower-bound-based certification algorithms in terms of both bound quality and speed.

DSpace@MIT

Association for the Advancement of Artificial Intelligence: AAAI Publications

Fast Training of Provably Robust Neural Networks by SingleProp

Author: Boopathy Akhilan
Chen Pin-Yu
Daniel Luca
Liu Sijia
Weng Tsui-Wei
Zhang Gaoyuan
Publication date: 01/02/2021

Recent works have developed several methods of defending neural networks against adversarial attacks with certified guarantees. However, these techniques can be computationally costly due to the use of certification during training. We develop a new regularizer that is both more efficient than existing certified defenses, requiring only one additional forward propagation through a network, and can be used to train networks with similar certified accuracy. Through experiments on MNIST and CIFAR-10 we demonstrate improvements in training speed and comparable certified accuracy compared to state-of-the-art certified defenses.

Comment: Published at AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Fast Training of Provably Robust Neural Networks by SingleProp

Author: Boopathy Akhilan
Chen Pin-Yu
Daniel Luca
Liu Sijia
Weng Lily
Zhang Gaoyuan
Publication date: 13/06/2022

DSpace@MIT

Marble: Model-based robustness analysis of stateful deep learning systems

Author: Bastani Osbert
Biggio Battista
Boopathy Akhilan
Carlini Nicholas
Chen Guangke
Cheng Minhao
Cisse Moustapha
Cousot Patrick
Ebrahimi Javid
Eykholt Kevin
Gao Ji
Gehr Timon
Gill Arthur
Goodfellow Ian
Hill Theodore P.
Hinton Geoffrey
Hu Qiang
Jia Robin
Katz Guy
Ko Ching-Yun
Lehmann Erich L
Li Jinfeng
Li Jiwei
Li Xin
Ma Lei
Maas Andrew L
Mann Henry B
Papernot Nicolas
Pei Kexin
Pennington Jeffrey
Puterman Martin L.
Rastogi Pushpendre
Silver David
Sriperumbudur Bharath K.
Szegedy Christian
Wachter Bjorn
Weng Tsui-Wei
Wold Svante
Zhao Zhengli
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/09/2020

Crossref

Institutional Knowledge at Singapore Management University